Organisms homozygous for targeted modification

ABSTRACT

Disclosed herein are homozygously modified organisms and methods of making and using these organisms.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. patent application Ser. No. 12/806,396, filed Aug. 11, 2010, which claims the benefit of U.S. Provisional Application No. 61/273,928, filed Aug. 11, 2009, the disclosures of which are hereby incorporated by reference in their entireties.

STATEMENT OF RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH

Not applicable.

TECHNICAL FIELD

The present invention relates to organisms which are homozygous for targeted modification of one or more endogenous genes. More specifically, the invention concerns organisms (e.g., plants or animals) in which both alleles of a gene are disrupted but in which the homozygous knockout organism does not contain exogenous sequences at the disrupted locus. The invention also concerns organisms (e.g., plants or animals) in which both alleles of a gene are modified by insertion of a transgene, wherein the transgene lacks sequences encoding a reporter (e.g., selectable marker).

BACKGROUND

Organisms (e.g., plants and animals) with homozygous targeted gene modifications are useful in a wide variety of agricultural, pharmaceutical and biotechnology applications. These organisms have traditionally been generated by inducing homologous recombination of a desired sequence (donor) at the gene selected for modification. However, in order to select for cells which have incorporated the donor DNA into the targeted locus, the targeting vector must include both positive and negative selection markers. See, e.g., U.S. Pat. No. 5,464,764. The selected cells produce heterozygotes that must be crossed to obtain organisms homozygous for the gene modification. Throughout the process, the selection markers remain integrated in the organism's genome such that the resulting modified homozygote includes both the modified gene and exogenous (e.g., marker) sequences.

Recently, nucleases, including zinc finger nucleases and homing endonucleases such as I-SceI, that are engineered to specifically bind to target sites have been successfully used for genome modification in a variety of different species. See, for example, United States Patent Publications 20030232410; 20050208489; 20050026157; 20050064474; 20060188987; 20060063231; 2008/0182332; 2009/0111188, and International Publication WO 07/014,275, the disclosures of which are incorporated by reference in their entireties for all purposes. These ZFNs can be used to create a double-strand break (DSB) in a target nucleotide sequence, which increases the frequency of donor nucleic acid introduction via homologous recombination at the targeted locus (targeted integration) more than 1000-fold. In addition, the inaccurate repair of a site-specific DSB by non-homologous end joining (NHEJ) can also result in targeted gene disruption.

Nonetheless, as with non-nuclease methods, in order to readily identify nuclease-mediated modifications in many organisms, an exogenous DNA including a selection marker or reporter gene is also targeted to the selected locus. See, e.g., Shukla et al. (2009) Nature 459(7245):437-441; U.S. Patent Publication Nos. 2008/0182332 and 2009/0111188. While targeted integration of a reporter allows for identification of modifications for a number of applications, this technique is not always desirable as it leaves additional exogenous nucleic acid sequences inserted into the genome.

Thus, there remains a need for compositions and methods for generating homozygous organisms modified at a desired gene locus, including homozygous KO organisms without inserted exogenous sequences at the locus (loci) targeted for modification and homozygous transgenic organisms without sequences encoding reporters such as selectable markers.

SUMMARY

Described herein are homozygous organisms comprising a modification at a desired gene locus as well as methods and systems for generating these organisms. Modified organisms include homozygous KO organisms without inserted exogenous sequences at the locus (loci) targeted for modification and homozygous transgenic organisms without sequences encoding reporters (e.g., selectable markers).

In one aspect, provided herein is a modified organism comprising at least one gene locus in which both alleles of the locus are modified (e.g., disrupted), but wherein the modified organism does not comprise exogenous sequences at the modified locus. In other embodiments, the organisms as described herein may comprise one or more transgenes (exogenous sequences) at any locus that is not disrupted (knocked out).

In another aspect, provided herein is a modified organism comprising at least one gene locus in which both alleles of the locus comprise a transgene, wherein the transgene does not comprise a reporter such as a screening or selectable marker. In a further aspect, provided herein is a modified organism comprising at least one gene locus in which all alleles (e.g., in a tri- or tetraploid organism) of the locus comprise a transgene, wherein the transgene does not comprise a reporter such as a screening or selectable marker.

Any of the organisms described herein may comprise more than one bi-allelic (or multi-allelic) modification (e.g., disruption or transgene). Furthermore, the organism may be, for example, a eukaryote (e.g., a plant or an animal, such as a mammal (e.g., a rat or mouse) or a fish).

In yet another aspect, provided herein is a method of generating a homozygous (bi-allelic) knockout organism lacking exogenous sequences, the method comprising introducing an exogenous sequence (e.g., reporter) into a cell using a nuclease that mediates targeted integration of the exogenous sequence into a selected locus of the genome, identifying cells in which the exogenous sequence has been introduced into one allele of the target locus (mono-allelic TI cells), identifying mono-allelic TI cells comprising a NHEJ deletion at the other allele (TI/NHEJ clones), allowing the TI/NHEJ clones to develop to reproductive maturity, crossing the TI/NHEJ organisms to each other (or in the case of plants, also allowing the organism to “self”), and identifying progeny that exhibit bi-allelic NHEJ modifications, thereby generating a bi-allelic knockout organism lacking exogenous sequences at the target locus.

In yet another aspect, provided herein is a method of generating a homozygous (bi-allelic) organism comprising desired transgene sequences lacking sequences encoding a reporter (e.g., selectable marker), the method comprising introducing an exogenous reporter sequence into a cell using a nuclease that mediates targeted integration of the reporter exogenous sequence into a selected locus of the genome, introducing the desired transgene sequence(s) into a cell wherein the nuclease mediates targeted integration of the transgene sequence into the selected locus of the genome, identifying cells in which the exogenous reporter sequence has been introduced into one allele of the target locus (mono-allelic reporter-TI cells), identifying mono-allelic reporter-TI cells comprising a transgene insertion at the other allele (reporter-TI/transgene clones), allowing the reporter-TI/transgene clones to develop to reproductive maturity, crossing the reporter-TI/transgene organisms to each other (or in the case of plants, also allowing the organism to “self”), and identifying progeny that exhibit bi-allelic transgene insertions, thereby generating a bi-allelic organism comprising the desired transgene but lacking reporter sequences at the target locus.
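The crossing step in the methods above can be illustrated with a short sketch of the expected progeny classes. It assumes simple Mendelian segregation at a single locus in a diploid organism with equal viability of all genotypes; that assumption, the function name `cross`, and the allele labels are illustrative and are not taken from the disclosure.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Expected genotype fractions among progeny of two diploid parents.

    Each parent is a pair of allele labels at one locus. Assumes each
    allele is transmitted with probability 1/2 (simple Mendelian
    segregation); this is an illustrative assumption only.
    """
    counts = Counter(
        tuple(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Hypothetical parents: one reporter-TI allele and one NHEJ-disrupted allele.
ti_nhej = ("TI", "NHEJ")
progeny = cross(ti_nhej, ti_nhej)

for genotype, fraction in sorted(progeny.items()):
    print("/".join(genotype), fraction)
```

Under these assumptions, about one quarter of the progeny of a TI/NHEJ × TI/NHEJ cross (or a selfed TI/NHEJ plant) would be NHEJ/NHEJ, i.e., bi-allelic knockouts carrying no reporter at the locus. The same arithmetic applies to reporter-TI/transgene parents.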

In certain embodiments, the nuclease comprises one or more zinc finger nucleases (ZFN). In other embodiments, the nuclease comprises a homing endonuclease or meganuclease, or a TAL-effector domain nuclease fusion (“TALEN”). In any of the embodiments described herein, the exogenous sequence (e.g., exogenous reporter sequence) and transgene may be introduced concurrently or sequentially with the nuclease(s). In some aspects, the exogenous sequence comprises a reporter gene such as a selectable marker (e.g., an herbicide resistance gene for plants) or a screening marker (e.g., a fluorescent protein). Any of the methods described herein may be repeated to generate organisms that are homozygous KOs or contain homozygous transgene insertions at multiple loci. It will be apparent that any of the methods described herein can be applied to polyploid organisms (e.g., by repeating the steps) that include more than two alleles, for example, tri- or tetraploid plants.

In another aspect, the invention provides kits that are useful for generating organisms with homozygous targeted gene modifications without inserted reporter (e.g., screening or selection) sequences. The kits typically include one or more nucleases (or polynucleotides encoding the nucleases) that bind to a target site (the selected locus for modification), optional cells containing the target site(s) of the nuclease, an exogenous sequence for targeted integration, an optional donor transgene comprising sequences homologous to the target site, and instructions for (i) introducing the nucleases and exogenous sequence into the cells; (ii) identifying cells in which the exogenous sequences are inserted into an allele at the target locus; (iii) identifying cells having mono-allelic targeted integration of the exogenous reporter sequence and modifications at the other allele of the locus (reporter-TI/modified cells); (iv) growing/developing selected cells into reproductively mature organisms; (v) crossing the reporter-TI/modified heterozygous organisms; and (vi) identifying progeny of the reporter-TI/modified crosses that are bi-allelic for the targeted gene modification. These steps may be repeated in polyploid organisms to modify all alleles as desired.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 (SEQ ID NOS:13-14) depicts sequence analysis of the non-TI allele of a ZFN-TI-modified IPK1 chromatid from Zea mays. Underlined base pairs show the binding sites for the ZFN pair used for genomic modifications. The “:” indicate deleted bases. The wild-type sequence is indicated in the first line and the multiple sequence reads of the sequenced non-TI allele event are shown below.

FIG. 2 depicts the scheme for introducing a fluorescent protein (enhanced yellow fluorescent protein (EYFP)) into the 3′ untranslated region of the murine histone H3.3B gene. The top line shows the schematic of the H3.3B gene in the murine genome on chromosome 11 and shows the target site on the gene sequence for the H3.3B-specific ZFN. The second line depicts the donor nucleotide (“targeting construct”) comprising an H3.3B gene linked to an EYFP sequence where the EYFP has been inserted at the 5′ end of the 3′ untranslated region. The bottom line depicts the insertion of the donor nucleotide into the H3.3B locus in the murine genome.

FIG. 3 depicts the results from FACS and Southern blot analysis demonstrating the heterozygous integration of the EYFP transgene into murine ES cells. FIG. 3A depicts the FACS results of ES cells lacking the inserted EYFP gene sequence in the H3.3B locus (top panel) and those results for cells that received the H3.3B-EYFP insertion. FIG. 3B depicts a Southern blot derived from genomic DNA of cells that have the inserted H3.3B-EYFP sequence versus wildtype cells.

FIG. 4 (SEQ ID NOS: 15-22) depicts the sequences from 19 non-reporter alleles demonstrating NHEJ in the H3.3B-EYFP/heterozygotes. Underlined base pairs show the binding sites for the ZFN pair used for the modifications. The “−” indicate deleted bases or spaces in the sequence to allow for alignment with the clones containing insertions.

DETAILED DESCRIPTION

Described herein are homozygously modified organisms, including knockout (KO) organisms with no added genetic material at either allele of the targeted locus and knock-in organisms that include transgenes of interest, but in which the transgene of interest lacks sequences encoding a reporter such as a selectable marker. Also described are methods of generating these modified organisms. In particular, the organisms typically have modifications that alter gene function at both alleles. These organisms are generated by providing cells from the organism of interest, using nucleases to insert an exogenous reporter sequence (e.g., screening or selectable marker) via targeted integration (TI) into an allele at a selected locus in the cell, identifying cells in which the exogenous reporter sequence was inserted into an allele at the selected locus, screening the mono-allelic reporter-TI clones for modification events at the second allele of the locus to identify cells with one reporter TI allele and in which the other allele is modified by NHEJ (reporter TI/NHEJ) or in which the other allele comprises a non-reporter marker transgene (reporter-TI/modified clones), allowing the reporter-TI/modified clones to develop to reproductively mature organisms, crossing the reporter-TI/modified organisms, and identifying progeny of the crosses that are biallelic knockout (NHEJ/NHEJ) or biallelic non-reporter marker knock-in (non-reporter TI/non-reporter marker TI) organisms.

General

Practice of the methods, as well as preparation and use of the compositions disclosed herein employ, unless otherwise indicated, conventional techniques in molecular biology, biochemistry, chromatin structure and analysis, computational chemistry, cell culture, recombinant DNA and related fields as are within the skill of the art. These techniques are fully explained in the literature. See, for example, Sambrook et al. MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, “Chromatin” (P. M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, “Chromatin Protocols” (P. B. Becker, ed.) Humana Press, Totowa, 1999.

DEFINITIONS

The terms “nucleic acid,” “polynucleotide,” and “oligonucleotide” are used interchangeably and refer to a deoxyribonucleotide or ribonucleotide polymer, in linear or circular conformation, and in either single- or double-stranded form. For the purposes of the present disclosure, these terms are not to be construed as limiting with respect to the length of a polymer. The terms can encompass known analogues of natural nucleotides, as well as nucleotides that are modified in the base, sugar and/or phosphate moieties (e.g., phosphorothioate backbones). In general, an analogue of a particular nucleotide has the same base-pairing specificity; i.e., an analogue of A will base-pair with T.

The terms “polypeptide,” “peptide” and “protein” are used interchangeably to refer to a polymer of amino acid residues. The term also applies to amino acid polymers in which one or more amino acids are chemical analogues or modified derivatives of corresponding naturally-occurring amino acids.

“Binding” refers to a sequence-specific, non-covalent interaction between macromolecules (e.g., between a protein and a nucleic acid). Not all components of a binding interaction need be sequence-specific (e.g., contacts with phosphate residues in a DNA backbone), as long as the interaction as a whole is sequence-specific. Such interactions are generally characterized by a dissociation constant (K_(d)) of 10⁻⁶ M or lower. “Affinity” refers to the strength of binding: increased binding affinity being correlated with a lower K_(d).
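The relationship between affinity and K_(d) can be made concrete with a brief sketch. It assumes the textbook single-site equilibrium model, in which the fraction of target bound equals [P]/([P] + K_(d)) when the binding protein P is in excess; that model, the function name `fraction_bound`, and the example concentrations are illustrative assumptions and are not part of the disclosure.

```python
def fraction_bound(protein_conc_molar, kd_molar):
    """Fraction of target sites occupied at equilibrium.

    Assumes simple 1:1 binding with the binding protein in excess over
    the target, so that occupancy = [P] / ([P] + Kd). Both arguments
    are in molar units.
    """
    if protein_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_conc_molar / (protein_conc_molar + kd_molar)


# Hypothetical comparison at 1 micromolar protein:
weak = fraction_bound(1e-6, 1e-6)    # Kd = 1 micromolar
tight = fraction_bound(1e-6, 1e-9)   # Kd = 1 nanomolar (higher affinity)
print(weak, tight)
```

In this model, occupancy is one half when the protein concentration equals K_(d), and a lower K_(d) gives higher occupancy at the same concentration, which is the sense in which a lower K_(d) corresponds to increased affinity.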

A “binding protein” is a protein that is able to bind non-covalently to another molecule. A binding protein can bind to, for example, a DNA molecule (a DNA-binding protein), an RNA molecule (an RNA-binding protein) and/or a protein molecule (a protein-binding protein). In the case of a protein-binding protein, it can bind to itself (to form homodimers, homotrimers, etc.) and/or it can bind to one or more molecules of a different protein or proteins. A binding protein can have more than one type of binding activity. For example, zinc finger proteins have DNA-binding, RNA-binding and protein-binding activity.

A “TAL-effector DNA binding domain” is a protein, or a domain within a larger protein, that interacts with DNA in a sequence-specific manner through one or more tandem repeat domains.

A “zinc finger DNA binding protein” (or binding domain) is a protein, or a domain within a larger protein, that binds DNA in a sequence-specific manner through one or more zinc fingers, which are regions of amino acid sequence within the binding domain whose structure is stabilized through coordination of a zinc ion. The term zinc finger DNA binding protein is often abbreviated as zinc finger protein or ZFP.

Zinc finger binding domains (e.g., the recognition helix region) can be “engineered” to bind to a predetermined nucleotide sequence. The engineered region of the zinc finger is typically the recognition helix, particularly the portion of the alpha-helical region numbered −1 to +6. Backbone sequences for an engineered recognition helix are known in the art. See, e.g., Miller et al. (2007) Nat Biotechnol 25, 778-785. Non-limiting examples of methods for engineering zinc finger proteins are design and selection. A designed zinc finger protein is a protein not occurring in nature whose design/composition results principally from rational criteria. Rational criteria for design include application of substitution rules and computerized algorithms for processing information in a database storing information of existing ZFP designs and binding data. See, for example, U.S. Pat. Nos. 6,140,081; 6,453,242; and 6,534,261; see, also WO 98/53058; WO 98/53059; WO 98/53060; WO 02/016536 and WO 03/016496.

A “selected” zinc finger protein is a protein not found in nature whose production results primarily from an empirical process such as phage display, interaction trap or hybrid selection. See e.g., U.S. Pat. Nos. 5,789,538; 5,925,523; 6,007,988; 6,013,453; 6,200,759; and International Patent Publication Nos. WO 95/19431; WO 96/06166; WO 98/53057; WO 98/54311; WO 00/27878; WO 01/60970; WO 01/88197 and WO 02/099084.

The term “sequence” refers to a nucleotide sequence of any length, which can be DNA or RNA; can be linear, circular or branched and can be either single-stranded or double-stranded. The term “donor sequence” refers to a nucleotide sequence that is inserted into a genome. A donor sequence can be of any length, for example between 2 and 10,000 nucleotides in length (or any integer value therebetween or thereabove), preferably between about 100 and 1,000 nucleotides in length (or any integer therebetween), more preferably between about 200 and 500 nucleotides in length.

A “homologous, non-identical sequence” refers to a first sequence which shares a degree of sequence identity with a second sequence, but whose sequence is not identical to that of the second sequence. For example, a polynucleotide comprising the wild-type sequence of a mutant gene is homologous and non-identical to the sequence of the mutant gene. In certain embodiments, the degree of homology between the two sequences is sufficient to allow homologous recombination therebetween, utilizing normal cellular mechanisms. Two homologous non-identical sequences can be any length and their degree of non-homology can be as small as a single nucleotide (e.g., for correction of a genomic point mutation by targeted homologous recombination) or as large as 10 or more kilobases (e.g., for insertion of a gene at a predetermined ectopic site in a chromosome). Two polynucleotides comprising the homologous non-identical sequences need not be the same length. For example, an exogenous polynucleotide (i.e., donor polynucleotide) of between 20 and 10,000 nucleotides or nucleotide pairs can be used.

Techniques for determining nucleic acid and amino acid sequence identity are known in the art. Typically, such techniques include determining the nucleotide sequence of the mRNA for a gene and/or determining the amino acid sequence encoded thereby, and comparing these sequences to a second nucleotide or amino acid sequence. Genomic sequences can also be determined and compared in this fashion. In general, identity refers to an exact nucleotide-to-nucleotide or amino acid-to-amino acid correspondence of two polynucleotides or polypeptide sequences, respectively. Two or more sequences (polynucleotide or amino acid) can be compared by determining their percent identity. The percent identity of two sequences, whether nucleic acid or amino acid sequences, is the number of exact matches between two aligned sequences divided by the length of the shorter sequence and multiplied by 100.
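The percent identity calculation defined above can be sketched as follows. The sketch assumes the two sequences have already been aligned to equal length with “-” marking gaps, and that the “length of the shorter sequence” means its ungapped length; the function name `percent_identity`, the gap symbol, and the example sequences are illustrative choices, not part of the disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences.

    Computes (exact matches / ungapped length of the shorter sequence)
    * 100. The inputs must be alignment rows of equal length; gap
    characters never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    len_a = sum(1 for x in aligned_a if x != gap)
    len_b = sum(1 for y in aligned_b if y != gap)
    shorter = min(len_a, len_b)
    if shorter == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * matches / shorter


# Hypothetical alignment: 6 exact matches; the shorter sequence has 7 bases.
identity = percent_identity("ACGTACGT", "ACGT-CGA")
print(round(identity, 1))
```

Producing the alignment itself is a separate step, typically performed with an established alignment program, and is outside the scope of this sketch.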

Alternatively, the degree of sequence similarity between polynucleotides can be determined by hybridization of polynucleotides under conditions that allow formation of stable duplexes between homologous regions, followed by digestion with single-stranded-specific nuclease(s), and size determination of the digested fragments. Two nucleic acid, or two polypeptide sequences are substantially homologous to each other when the sequences exhibit at least about 70%-75%, preferably 80%-82%, more preferably 85%-90%, even more preferably 92%, still more preferably 95%, and most preferably 98% sequence identity over a defined length of the molecules, as determined using the methods above. As used herein, substantially homologous also refers to sequences showing complete identity to a specified DNA or polypeptide sequence. DNA sequences that are substantially homologous can be identified in a Southern hybridization experiment under, for example, stringent conditions, as defined for that particular system. Defining appropriate hybridization conditions is within the skill of the art. See, e.g., Sambrook et al., supra; Nucleic Acid Hybridization: A Practical Approach, editors B. D. Hames and S. J. Higgins (1985) Oxford; Washington, D.C.; IRL Press.

“Recombination” refers to a process of exchange of genetic information between two polynucleotides. For the purposes of this disclosure, “homologous recombination (HR)” refers to the specialized form of such exchange that takes place, for example, during repair of double-strand breaks in cells via homology-directed repair mechanisms. This process requires nucleotide sequence homology, uses a “donor” molecule to template repair of a “target” molecule (i.e., the one that experienced the double-strand break), and is variously known as “non-crossover gene conversion” or “short tract gene conversion,” because it leads to the transfer of genetic information from the donor to the target. Without wishing to be bound by any particular theory, such transfer can involve mismatch correction of heteroduplex DNA that forms between the broken target and the donor, and/or “synthesis-dependent strand annealing,” in which the donor is used to resynthesize genetic information that will become part of the target, and/or related processes. Such specialized HR often results in an alteration of the sequence of the target molecule such that part or all of the sequence of the donor polynucleotide is incorporated into the target polynucleotide.

In the methods of the disclosure, one or more targeted nucleases as described herein create a double-stranded break in the target sequence (e.g., cellular chromatin) at a predetermined site, and a “donor” polynucleotide, having homology to the nucleotide sequence in the region of the break, can be introduced into the cell. The presence of the double-stranded break (DSB) has been shown to facilitate integration of the donor sequence. The donor sequence may be physically integrated or, alternatively, the donor polynucleotide is used as a template for repair of the break via homologous recombination, resulting in the introduction of all or part of the nucleotide sequence as in the donor into the cellular chromatin. Thus, a first sequence in cellular chromatin can be altered and, in certain embodiments, can be converted into a sequence present in a donor polynucleotide. Thus, the use of the terms “replace” or “replacement” can be understood to represent replacement of one nucleotide sequence by another (i.e., replacement of a sequence in the informational sense), and does not necessarily require physical or chemical replacement of one polynucleotide by another. In some embodiments, two DSBs are introduced by the targeted nucleases described herein, resulting in the deletion of the DNA in between the DSBs. In some embodiments, the “donor” polynucleotides are inserted between these two DSBs.

Thus, in certain embodiments, portions of the donor sequence that are homologous to sequences in the region of interest exhibit between about 80 to 99% (or any integer therebetween) sequence identity to the genomic sequence that is replaced. In other embodiments, the homology between the donor and genomic sequence is higher than 99%, for example if only 1 nucleotide differs as between donor and genomic sequences of over 100 contiguous base pairs. In certain cases, a non-homologous portion of the donor sequence can contain sequences not present in the region of interest, such that new sequences are introduced into the region of interest. In these instances, the non-homologous sequence is generally flanked by sequences of 50-1,000 base pairs (or any integral value therebetween) or any number of base pairs greater than 1,000, that are homologous or identical to sequences in the region of interest. In other embodiments, the donor sequence is non-homologous to the first sequence, and is inserted into the genome by non-homologous recombination mechanisms.

Any of the methods described herein can be used for partial or complete inactivation of one or more target sequences in a cell by targeted integration of a donor sequence that disrupts expression of the gene(s) of interest. Cell lines with partially or completely inactivated genes are also provided.

Furthermore, the methods of targeted integration as described herein can also be used to integrate one or more exogenous sequences. The exogenous nucleic acid sequence can comprise, for example, one or more genes or cDNA molecules, or any type of coding or noncoding sequence, as well as one or more control elements (e.g., promoters). In addition, the exogenous nucleic acid sequence may produce one or more RNA molecules (e.g., small hairpin RNAs (shRNAs), inhibitory RNAs (RNAis), microRNAs (miRNAs), etc.).

“Cleavage” refers to the breakage of the covalent backbone of a DNA molecule. Cleavage can be initiated by a variety of methods including, but not limited to, enzymatic or chemical hydrolysis of a phosphodiester bond. Both single-stranded cleavage and double-stranded cleavage are possible, and double-stranded cleavage can occur as a result of two distinct single-stranded cleavage events. DNA cleavage can result in the production of either blunt ends or staggered ends. In certain embodiments, fusion polypeptides are used for targeted double-stranded DNA cleavage.

A “cleavage half-domain” is a polypeptide sequence which, in conjunction with a second polypeptide (either identical or different) forms a complex having cleavage activity (preferably double-strand cleavage activity). The terms “first and second cleavage half-domains;” “+ and − cleavage half-domains” and “right and left cleavage half-domains” are used interchangeably to refer to pairs of cleavage half-domains that dimerize.

An “engineered cleavage half-domain” is a cleavage half-domain that has been modified so as to form obligate heterodimers with another cleavage half-domain (e.g., another engineered cleavage half-domain). See, also, U.S. Patent Publication Nos. 2005/0064474, 20070218528 and 2008/0131962, incorporated herein by reference in their entireties.

“Chromatin” is the nucleoprotein structure comprising the cellular genome. Cellular chromatin comprises nucleic acid, primarily DNA, and protein, including histones and non-histone chromosomal proteins. The majority of eukaryotic cellular chromatin exists in the form of nucleosomes, wherein a nucleosome core comprises approximately 150 base pairs of DNA associated with an octamer comprising two each of histones H2A, H2B, H3 and H4; and linker DNA (of variable length depending on the organism) extends between nucleosome cores. A molecule of histone H1 is generally associated with the linker DNA. For the purposes of the present disclosure, the term “chromatin” is meant to encompass all types of cellular nucleoprotein, both prokaryotic and eukaryotic. Cellular chromatin includes both chromosomal and episomal chromatin.

A “chromosome” is a chromatin complex comprising all or a portion of the genome of a cell. The genome of a cell is often characterized by its karyotype, which is the collection of all the chromosomes that comprise the genome of the cell. The genome of a cell can comprise one or more chromosomes.

An “episome” is a replicating nucleic acid, nucleoprotein complex or other structure comprising a nucleic acid that is not part of the chromosomal karyotype of a cell. Examples of episomes include plasmids and certain viral genomes.

A “target site” or “target sequence” is a nucleic acid sequence that defines a portion of a nucleic acid to which a binding molecule will bind, provided sufficient conditions for binding exist. For example, the sequence 5′-GAATTC-3′ is a target site for the Eco RI restriction endonuclease.
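As a minimal illustration of locating a target site within a longer sequence, the sketch below scans a DNA string for the Eco RI site named above. The function name `find_target_sites` and the example sequence are hypothetical. The scan covers only the strand as given, which suffices for GAATTC because that site is its own reverse complement; a non-palindromic site would also require a search of the reverse complement.

```python
def find_target_sites(sequence, site="GAATTC"):
    """Return 0-based start positions of every occurrence of `site`.

    Matching is case-insensitive and overlapping occurrences are
    reported. Only the strand as written is scanned.
    """
    sequence = sequence.upper()
    site = site.upper()
    if not site:
        raise ValueError("site must be a non-empty sequence")
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping matches are not missed.
        start = sequence.find(site, start + 1)
    return positions


# Hypothetical 18-base sequence containing two Eco RI sites.
sites = find_target_sites("AAGAATTCGGGAATTCTT")
print(sites)
```

Exact string matching reflects only the sequence definition of a target site; whether binding actually occurs also depends on the conditions noted in the definition.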

An “exogenous” molecule is a molecule that is not normally present in a cell, but can be introduced into a cell by one or more genetic, biochemical or other methods. “Normal presence in the cell” is determined with respect to the particular developmental stage and environmental conditions of the cell. Thus, for example, a molecule that is present only during embryonic development of muscle is an exogenous molecule with respect to an adult muscle cell. Similarly, a molecule induced by heat shock is an exogenous molecule with respect to a non-heat-shocked cell. An exogenous molecule can comprise, for example, a functioning version of a malfunctioning endogenous molecule or a malfunctioning version of a normally-functioning endogenous molecule.

An exogenous molecule can be, among other things, a small molecule, such as is generated by a combinatorial chemistry process, or a macromolecule such as a protein, nucleic acid, carbohydrate, lipid, glycoprotein, lipoprotein, polysaccharide, any modified derivative of the above molecules, or any complex comprising one or more of the above molecules. Nucleic acids include DNA and RNA, can be single- or double-stranded; can be linear, branched or circular; and can be of any length. Nucleic acids include those capable of forming duplexes, as well as triplex-forming nucleic acids. See, for example, U.S. Pat. Nos. 5,176,996 and 5,422,251. Proteins include, but are not limited to, DNA-binding proteins, transcription factors, chromatin remodeling factors, methylated DNA binding proteins, polymerases, methylases, demethylases, acetylases, deacetylases, kinases, phosphatases, integrases, recombinases, ligases, topoisomerases, gyrases and helicases.

An exogenous molecule can be the same type of molecule as an endogenous molecule, e.g., an exogenous protein or nucleic acid. For example, an exogenous nucleic acid can comprise an infecting viral genome, a plasmid or episome introduced into a cell, or a chromosome that is not normally present in the cell. Methods for the introduction of exogenous molecules into cells are known to those of skill in the art and include, but are not limited to, lipid-mediated transfer (i.e., liposomes, including neutral and cationic lipids), electroporation, direct injection, cell fusion, particle bombardment, calcium phosphate co-precipitation, DEAE-dextran-mediated transfer and viral vector-mediated transfer.

By contrast, an “endogenous” molecule is one that is normally present in a particular cell at a particular developmental stage under particular environmental conditions. For example, an endogenous nucleic acid can comprise a chromosome, the genome of a mitochondrion, chloroplast or other organelle, or a naturally-occurring episomal nucleic acid. Additional endogenous molecules can include proteins, for example, transcription factors and enzymes.

A “fusion” molecule is a molecule in which two or more subunit molecules are linked, preferably covalently. The subunit molecules can be the same chemical type of molecule, or can be different chemical types of molecules. Examples of the first type of fusion molecule include, but are not limited to, fusion proteins (for example, a fusion between a ZFP DNA-binding domain and a cleavage domain or between a TAL-effector DNA binding domain and a cleavage domain) and fusion nucleic acids (for example, a nucleic acid encoding the fusion protein described supra). Examples of the second type of fusion molecule include, but are not limited to, a fusion between a triplex-forming nucleic acid and a polypeptide, and a fusion between a minor groove binder and a nucleic acid.

Expression of a fusion protein in a cell can result from delivery of the fusion protein to the cell or from delivery of a polynucleotide encoding the fusion protein to the cell, wherein the polynucleotide is transcribed, and the transcript is translated, to generate the fusion protein. Trans-splicing, polypeptide cleavage and polypeptide ligation can also be involved in expression of a protein in a cell. Methods for polynucleotide and polypeptide delivery to cells are presented elsewhere in this disclosure.

A “gene,” for the purposes of the present disclosure, includes a DNA region encoding a gene product (see infra), as well as all DNA regions which regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences. Accordingly, a gene includes, but is not necessarily limited to, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.

“Gene expression” refers to the conversion of the information contained in a gene into a gene product. A gene product can be the direct transcriptional product of a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA, shRNA, RNAi, miRNA or any other type of RNA) or a protein produced by translation of an mRNA. Gene products also include RNAs which are modified, by processes such as capping, polyadenylation, methylation, and editing, and proteins modified by, for example, methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation, myristoylation, and glycosylation.

“Modulation” of gene expression refers to a change in the activity of a gene. Modulation of expression can include, but is not limited to, gene activation and gene repression. Genome editing (e.g., cleavage, alteration, inactivation, donor integration, random mutation) can be used to modulate expression. Gene inactivation refers to any reduction in gene expression as compared to a cell that does not include a modifier as described herein. Thus, gene inactivation may be partial or complete.

A “region of interest” is any region of cellular chromatin, such as, for example, a gene or a non-coding sequence within or adjacent to a gene, in which it is desirable to bind an exogenous molecule. Binding can be for the purposes of targeted DNA cleavage and/or targeted recombination. A region of interest can be present in a chromosome, an episome, an organellar genome (e.g., mitochondrial, chloroplast), or an infecting viral genome, for example. A region of interest can be within the coding region of a gene, within transcribed non-coding regions such as, for example, leader sequences, trailer sequences or introns, or within non-transcribed regions, either upstream or downstream of the coding region. A region of interest can be as small as a single nucleotide pair or up to 2,000 nucleotide pairs in length, or any integral value of nucleotide pairs.

“Eukaryotic” cells include, but are not limited to, fungal cells (such as yeast), plant cells, animal cells, mammalian cells and human cells (e.g., T-cells).

“Plant” cells include, but are not limited to, cells of monocotyledonous (monocots) or dicotyledonous (dicots) plants. Non-limiting examples of monocots include cereal plants such as maize, rice, barley, oats, wheat, sorghum, rye, sugarcane, pineapple, onion, banana, and coconut. Non-limiting examples of dicots include tobacco, tomato, sunflower, cotton, sugarbeet, potato, lettuce, melon, soybean, canola (rapeseed), and alfalfa. Plant cells may be from any part of the plant and/or from any stage of plant development.

The terms “operative linkage” and “operatively linked” (or “operably linked”) are used interchangeably with reference to a juxtaposition of two or more components (such as sequence elements), in which the components are arranged such that both components function normally and allow the possibility that at least one of the components can mediate a function that is exerted upon at least one of the other components. By way of illustration, a transcriptional regulatory sequence, such as a promoter, is operatively linked to a coding sequence if the transcriptional regulatory sequence controls the level of transcription of the coding sequence in response to the presence or absence of one or more transcriptional regulatory factors. A transcriptional regulatory sequence is generally operatively linked in cis with a coding sequence, but need not be directly adjacent to it. For example, an enhancer is a transcriptional regulatory sequence that is operatively linked to a coding sequence, even though they are not contiguous.

With respect to fusion polypeptides, the term “operatively linked” can refer to the fact that each of the components performs the same function in linkage to the other component as it would if it were not so linked. For example, with respect to a fusion polypeptide in which a ZFP DNA-binding domain is fused to a cleavage domain, the ZFP DNA-binding domain and the cleavage domain are in operative linkage if, in the fusion polypeptide, the ZFP DNA-binding domain portion is able to bind its target site and/or its binding site, while the cleavage domain is able to cleave DNA in the vicinity of the target site.

A “functional fragment” of a protein, polypeptide or nucleic acid is a protein, polypeptide or nucleic acid whose sequence is not identical to the full-length protein, polypeptide or nucleic acid, yet retains the same function as the full-length protein, polypeptide or nucleic acid. A functional fragment can possess more, fewer, or the same number of residues as the corresponding native molecule, and/or can contain one or more amino-acid or nucleotide substitutions. Methods for determining the function of a nucleic acid (e.g., coding function, ability to hybridize to another nucleic acid) are well-known in the art. Similarly, methods for determining protein function are well-known. For example, the DNA-binding function of a polypeptide can be determined by filter-binding, electrophoretic mobility-shift, or immunoprecipitation assays. DNA cleavage can be assayed by gel electrophoresis. See Ausubel et al., supra. The ability of a protein to interact with another protein can be determined, for example, by co-immunoprecipitation, two-hybrid assays or complementation, both genetic and biochemical. See, for example, Fields et al. (1989) Nature 340:245-246; U.S. Pat. No. 5,585,245 and PCT WO 98/44350.

A “vector” is capable of transferring gene sequences to target cells. Typically, “vector construct,” “expression vector,” and “gene transfer vector” mean any nucleic acid construct capable of directing the expression of a gene of interest and which can transfer gene sequences to target cells. Thus, the term includes cloning and expression vehicles, as well as integrating vectors.

A “reporter gene” or “reporter sequence” refers to any sequence that produces a protein product that is easily measured, preferably in a routine assay. Suitable reporter genes for particular species will be known to the skilled artisan and include, but are not limited to, Mel1, chloramphenicol acetyl transferase (CAT), light generating proteins such as GFP, luciferase and/or β-galactosidase. Suitable reporter genes for animals may also encode markers or enzymes that can be measured in vivo such as thymidine kinase, measured in vivo using PET scanning, or luciferase, measured in vivo via whole body luminometric imaging. Selectable markers can also be used instead of, or in addition to, reporters. Positive selection markers are those polynucleotides that encode a product that enables only cells that carry and express the gene to survive and/or grow under certain conditions. For example, cells that express the neomycin resistance (Neo^(r)) gene are resistant to the compound G418, while cells that do not express Neo^(r) are killed by G418. Likewise, plant cells that express an herbicide tolerance (resistance) gene (e.g., the PAT (phosphinothricin acetyl transferase) gene) are resistant to the herbicide bialaphos. Other examples of positive selection markers, including hygromycin resistance and the like, will be known to those of skill in the art. Negative selection markers are those polynucleotides that encode a product that enables only cells that carry and express the gene to be killed under certain conditions. For example, cells that express thymidine kinase (e.g., herpes simplex virus thymidine kinase, HSV-TK) are killed when ganciclovir is added. Other negative selection markers are known to those skilled in the art. The selectable marker need not be a transgene and, additionally, reporters and selectable markers can be used in various combinations.

Overview

Described herein are compositions and methods for generating homozygously modified organisms, including knock-out (KO) organisms without inserted exogenous sequences such as selectable markers, and organisms containing a transgene without sequences encoding reporters (e.g., selectable markers) at both alleles of the desired locus. The organisms are typically generated in two steps. In the first step, one or more nucleases (e.g., ZFNs) are used for targeted integration (TI) of a heterologous, donor-derived sequence of interest into the desired locus in the cell. The heterologous sequence typically contains a reporter (e.g., selectable or screening marker) that allows for selection of clones with a reporter-TI at one allele of the locus of interest. For TI of a transgene, a desired transgene donor (lacking reporter sequences) is co-introduced with the reporter donor. The reporter-TI-selected clones are then genotyped at the non-reporter-TI allele to identify cells in which the non-reporter-TI allele is disrupted by NHEJ, or to identify cells that contain the transgene lacking reporter sequences inserted at the non-reporter-TI allele.

In a second step, the reporter-TI/modified clones (e.g., reporter-TI/NHEJ or reporter-TI/non-reporter TI clones) identified as above are allowed to develop to reproductive maturity and then these reporter-TI/modified heterozygous organisms are crossed to each other or self-crossed. One-quarter of the progeny of the reporter-TI/modified organisms from these crosses are expected to be homozygous for the modified events (NHEJ/NHEJ or non-reporter TI/non-reporter TI), thus providing homozygously modified organisms without any inserted reporter DNA.
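The one-quarter expectation stated above can be illustrated with a short Python sketch that enumerates the gametes of two reporter-TI/modified heterozygotes. The allele labels "R" (reporter-TI) and "M" (modified, i.e., NHEJ or non-reporter TI) and the function name are illustrative assumptions, as is the premise of simple Mendelian segregation at a single locus with equal viability of all genotypes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Return expected genotype frequencies for a single-locus cross.

    Each parent is a pair of allele labels; every gamete combination is
    assumed to be equally likely (simple Mendelian segregation).
    """
    counts = Counter(tuple(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# "R" = reporter-TI allele, "M" = modified allele (hypothetical labels).
heterozygote = ("R", "M")
progeny = cross(heterozygote, heterozygote)

for genotype, frequency in sorted(progeny.items()):
    print("/".join(genotype), frequency)
```

Under these assumptions the M/M class, which carries no reporter DNA, accounts for one-quarter of the progeny, with one-half heterozygous and one-quarter homozygous for the reporter-TI allele.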

Nucleases

The methods and compositions described herein are broadly applicable and may involve any nuclease of interest. Non-limiting examples of nucleases include meganucleases, zinc finger nucleases and TALENs. The nuclease may comprise heterologous DNA-binding and cleavage domains (e.g., zinc finger nucleases; meganuclease DNA-binding domains with heterologous cleavage domains or TALENs) or, alternatively, the DNA-binding domain of a naturally-occurring nuclease may be altered to bind to a selected target site (e.g., a meganuclease that has been engineered to bind to a site different from the cognate binding site).

In certain embodiments, the nuclease is a meganuclease (homing endonuclease). Naturally-occurring meganucleases recognize 15-40 base-pair cleavage sites and are commonly grouped into four families: the LAGLIDADG family, the GIY-YIG family, the His-Cys box family and the HNH family. Exemplary homing endonucleases include I-SceI, I-CeuI, PI-PspI, PI-Sce, I-SceIV, I-CsmI, I-PanI, I-SceII, I-PpoI, I-SceIII, I-CreI, I-TevI, I-TevII and I-TevIII. Their recognition sequences are known. See also U.S. Pat. No. 5,420,032; U.S. Pat. No. 6,833,252; Belfort et al. (1997) Nucleic Acids Res. 25:3379-3388; Dujon et al. (1989) Gene 82:115-118; Perler et al. (1994) Nucleic Acids Res. 22:1125-1127; Jasin (1996) Trends Genet. 12:224-228; Gimble et al. (1996) J. Mol. Biol. 263:163-180; Argast et al. (1998) J. Mol. Biol. 280:345-353 and the New England Biolabs catalogue.

DNA-binding domains from naturally-occurring meganucleases, primarily from the LAGLIDADG family, have been used to promote site-specific genome modification in plants, yeast, Drosophila, mammalian cells and mice, but this approach has been limited to the modification of either homologous genes that conserve the meganuclease recognition sequence (Monet et al. (1999), Biochem. Biophys. Res. Commun. 255: 88-93) or pre-engineered genomes into which a recognition sequence has been introduced (Rouet et al. (1994), Mol. Cell. Biol. 14: 8096-106; Chilton et al. (2003), Plant Physiology. 133: 956-65; Puchta et al. (1996), Proc. Natl. Acad. Sci. USA 93: 5055-60; Rong et al. (2002), Genes Dev. 16: 1568-81; Gouble et al. (2006), J. Gene Med. 8(5):616-622). Accordingly, attempts have been made to engineer meganucleases to exhibit novel binding specificity at medically or biotechnologically relevant sites (Porteus et al. (2005), Nat. Biotechnol. 23: 967-73; Sussman et al. (2004), J. Mol. Biol. 342: 31-41; Epinat et al. (2003), Nucleic Acids Res. 31: 2952-62; Chevalier et al. (2002) Molec. Cell 10:895-905; Chames et al. (2005) Nucleic Acids Res 33(20):e178; Arnould et al. (2006) J. Mol. Biol. 355:443-458; Ashworth et al. (2006) Nature 441:656-659; Paques et al. (2007) Current Gene Therapy 7:49-66; U.S. Patent Publication Nos. 20070117128; 20060206949; 20060153826; 20060078552; and 20040002092). In addition, naturally-occurring or engineered DNA-binding domains from meganucleases have also been operably linked with a cleavage domain from a heterologous nuclease (e.g., FokI).

The plant pathogenic bacteria of the genus Xanthomonas are known to cause many diseases in important crop plants. Pathogenicity of Xanthomonas depends on a conserved type III secretion (T3S) system which injects more than 25 different effector proteins into the plant cell. Among these injected proteins are transcription activator-like (TAL) effectors which mimic plant transcriptional activators and manipulate the plant transcriptome (see Kay et al (2007) Science 318:648-651). These proteins contain a DNA binding domain and a transcriptional activation domain. One of the most well characterized TAL-effectors is AvrBs3 from Xanthomonas campestris pv. vesicatoria (see Bonas et al (1989) Mol Gen Genet. 218: 127-136 and WO2010079430). TAL-effectors contain a centralized domain of tandem repeats, each repeat containing approximately 34 amino acids, which are key to the DNA binding specificity of these proteins. In addition, they contain a nuclear localization sequence and an acidic transcriptional activation domain (for a review see Schornack S, et al (2006) J Plant Physiol 163(3): 256-272). In addition, in the phytopathogenic bacteria Ralstonia solanacearum two genes, designated brg11 and hpx17, have been found that are homologous to the AvrBs3 family of Xanthomonas in the R. solanacearum biovar 1 strain GMI1000 and in the biovar 4 strain RS1000 (See Heuer et al (2007) Appl and Envir Micro 73(13): 4379-4384). These genes are 98.9% identical in nucleotide sequence to each other but differ by a deletion of 1,575 bp in the repeat domain of hpx17. However, both gene products have less than 40% sequence identity with AvrBs3 family proteins of Xanthomonas.

Specificity of these TAL effectors depends on the sequences found in the tandem repeats. The repeated sequence comprises approximately 102 bp and the repeats are typically 91-100% homologous with each other (Bonas et al, ibid). Polymorphism of the repeats is usually located at positions 12 and 13 and there appears to be a one-to-one correspondence between the identity of the hypervariable diresidues at positions 12 and 13 and the identity of the contiguous nucleotides in the TAL-effector's target sequence (see Moscou and Bogdanove, (2009) Science 326:1501 and Boch et al (2009) Science 326:1509-1512). Experimentally, the code for DNA recognition of these TAL-effectors has been determined such that an HD sequence at positions 12 and 13 leads to a binding to cytosine (C), NG binds to T, NI to A, NS binds to A, C, G or T, NN binds to A or G, and IG binds to T. These DNA binding repeats have been assembled into proteins with new combinations and numbers of repeats, to make artificial transcription factors that are able to interact with new sequences and activate the expression of a reporter gene in plant cells (Boch et al, ibid). However, these DNA binding domains have not been shown to have general applicability in the field of targeted genomic editing or targeted gene regulation in all cell types. In particular, Boch et al showed function in plant cells only (namely, in the biological setting in which these domains have evolved to function) and did not demonstrate activity at an endogenous locus. Moreover, engineered TAL-effectors have not been shown to function in mammalian cells in association with any exogenous functional protein effector domains (nuclease, transcription factor, regulatory, enzymatic, recombinase, methylase, and/or reporter domains) not naturally found in natural Xanthomonas TAL-effector proteins. In a recent publication by Christian et al ((2010) Genetics epub 10.1534/genetics.110.120717), engineered TAL proteins were linked to a FokI cleavage half domain to yield a TAL effector domain nuclease fusion (TALEN) and were shown to be active in a yeast reporter assay where cleavage of the plasmid based target is required for the assay.
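The correspondence between the diresidues at positions 12 and 13 and the target nucleotides can be written as a lookup table. The following Python sketch is illustrative only: the table follows the code as reported by Boch et al. (2009), in which NI pairs with A and NS with any base, and the names RVD_CODE and rvds_match are hypothetical. It checks compatibility of a repeat array with a candidate site and makes no claim about binding affinity.

```python
# Diresidue (positions 12 and 13) -> nucleotides it is reported to recognize.
RVD_CODE = {
    "HD": {"C"},
    "NG": {"T"},
    "NI": {"A"},
    "NS": {"A", "C", "G", "T"},
    "NN": {"A", "G"},
    "IG": {"T"},
}


def rvds_match(rvds, target):
    """Return True if each repeat's diresidue accepts the aligned base.

    rvds   -- list of two-letter diresidue codes, one per repeat
    target -- DNA string of the same length, read in the same order
    """
    if len(rvds) != len(target):
        return False
    return all(base in RVD_CODE[rvd] for rvd, base in zip(rvds, target.upper()))


# Hypothetical four-repeat array tested against two candidate sites.
example_repeats = ["HD", "NG", "NI", "NN"]
print(rvds_match(example_repeats, "CTAG"))
print(rvds_match(example_repeats, "CTAC"))
```

The one-repeat-to-one-nucleotide structure is what makes the lookup a simple position-by-position comparison.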

In other embodiments, the nuclease is a zinc finger nuclease (ZFN). ZFNs comprise a zinc finger protein that has been engineered to bind to a target site in a gene of choice and a cleavage domain or a cleavage half-domain.

Zinc finger binding domains can be engineered to bind to a sequence of choice. See, for example, Beerli et al. (2002) Nature Biotechnol. 20:135-141; Pabo et al. (2001) Ann. Rev. Biochem. 70:313-340; Isalan et al. (2001) Nature Biotechnol. 19:656-660; Segal et al. (2001) Curr. Opin. Biotechnol. 12:632-637; Choo et al. (2000) Curr. Opin. Struct. Biol. 10:411-416. An engineered zinc finger binding domain can have a novel binding specificity, compared to a naturally-occurring zinc finger protein. Engineering methods include, but are not limited to, rational design and various types of selection. Rational design includes, for example, using databases comprising triplet (or quadruplet) nucleotide sequences and individual zinc finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular triplet or quadruplet sequence. See, for example, co-owned U.S. Pat. Nos. 6,453,242 and 6,534,261, incorporated by reference herein in their entireties.

Exemplary selection methods, including phage display and two-hybrid systems, are disclosed in U.S. Pat. Nos. 5,789,538; 5,925,523; 6,007,988; 6,013,453; 6,410,248; 6,140,466; 6,200,759; and 6,242,568; as well as WO 98/37186; WO 98/53057; WO 00/27878; WO 01/88197 and GB 2,338,237. In addition, enhancement of binding specificity for zinc finger binding domains has been described, for example, in co-owned WO 02/077227.

Selection of target sites, ZFNs, and methods for design and construction of fusion proteins (and polynucleotides encoding same) are known to those of skill in the art and described in detail in U.S. Patent Application Publication Nos. 20050064474 and 20060188987, incorporated by reference in their entireties herein.

In addition, as disclosed in these and other references, zinc finger domains and/or multi-fingered zinc finger proteins may be linked together using any suitable linker sequences, including for example, linkers of 5 or more amino acids in length. See, e.g., U.S. Pat. Nos. 6,479,626; 6,903,185; and 7,153,949 for exemplary linker sequences 6 or more amino acids in length. The proteins described herein may include any combination of suitable linkers between the individual zinc fingers of the protein.

Nucleases such as ZFNs, TALENs and/or meganucleases also comprise a nuclease (cleavage domain, cleavage half-domain). As noted above, the cleavage domain may be heterologous to the DNA-binding domain, for example a zinc finger DNA-binding domain and a cleavage domain from a nuclease or a meganuclease DNA-binding domain or a TAL-effector domain and a cleavage domain from a different nuclease. Heterologous cleavage domains can be obtained from any endonuclease or exonuclease. Exemplary endonucleases from which a cleavage domain can be derived include, but are not limited to, restriction endonucleases and homing endonucleases. See, for example, 2002-2003 Catalogue, New England Biolabs, Beverly, Mass.; and Belfort et al. (1997) Nucleic Acids Res. 25:3379-3388. Additional enzymes which cleave DNA are known (e.g., S1 Nuclease; mung bean nuclease; pancreatic DNase I; micrococcal nuclease; yeast HO endonuclease; see also Linn et al. (eds.) Nucleases, Cold Spring Harbor Laboratory Press; 1993). One or more of these enzymes (or functional fragments thereof) can be used as a source of cleavage domains and cleavage half-domains.

Similarly, a cleavage half-domain can be derived from any nuclease or portion thereof, as set forth above, that requires dimerization for cleavage activity. In general, two fusion proteins are required for cleavage if the fusion proteins comprise cleavage half-domains. Alternatively, a single protein comprising two cleavage half-domains can be used. The two cleavage half-domains can be derived from the same endonuclease (or functional fragments thereof), or each cleavage half-domain can be derived from a different endonuclease (or functional fragments thereof). In addition, the target sites for the two fusion proteins are preferably disposed, with respect to each other, such that binding of the two fusion proteins to their respective target sites places the cleavage half-domains in a spatial orientation to each other that allows the cleavage half-domains to form a functional cleavage domain, e.g., by dimerizing. Thus, in certain embodiments, the near edges of the target sites are separated by 5-8 nucleotides or by 15-18 nucleotides. However, any integral number of nucleotides or nucleotide pairs can intervene between two target sites (e.g., from 2 to 50 nucleotide pairs or more). In general, the site of cleavage lies between the target sites.
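The spacing between the near edges of two target sites can be checked with simple coordinate arithmetic. The Python sketch below is illustrative only: it assumes 0-based positions on one strand, and the names spacer_length, is_preferred_spacing and PREFERRED_SPACERS are hypothetical. The 5-8 and 15-18 nucleotide windows are those named for certain embodiments above; other separations are not excluded by the disclosure.

```python
# Spacer windows named for certain embodiments (inclusive, in nucleotides).
PREFERRED_SPACERS = (range(5, 9), range(15, 19))


def spacer_length(left_site_end, right_site_start):
    """Nucleotides lying strictly between the near edges of two target sites.

    left_site_end    -- 0-based index of the last base of the left site
    right_site_start -- 0-based index of the first base of the right site
    """
    return right_site_start - left_site_end - 1


def is_preferred_spacing(left_site_end, right_site_start):
    """True if the gap falls within the 5-8 or 15-18 nucleotide windows."""
    gap = spacer_length(left_site_end, right_site_start)
    return any(gap in window for window in PREFERRED_SPACERS)


# Hypothetical pair: left site occupies bases 10-18, right site starts at 25.
print(spacer_length(18, 25))
print(is_preferred_spacing(18, 25))
```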

Restriction endonucleases (restriction enzymes) are present in many species and are capable of sequence-specific binding to DNA (at a recognition site), and cleaving DNA at or near the site of binding. Certain restriction enzymes (e.g., Type IIS) cleave DNA at sites removed from the recognition site and have separable binding and cleavage domains. For example, the Type IIS enzyme Fok I catalyzes double-stranded cleavage of DNA, at 9 nucleotides from its recognition site on one strand and 13 nucleotides from its recognition site on the other. See, for example, U.S. Pat. Nos. 5,356,802; 5,436,150 and 5,487,994; as well as Li et al. (1992) Proc. Natl. Acad. Sci. USA 89:4275-4279; Li et al. (1993) Proc. Natl. Acad. Sci. USA 90:2764-2768; Kim et al. (1994a) Proc. Natl. Acad. Sci. USA 91:883-887; Kim et al. (1994b) J. Biol. Chem. 269:31,978-31,982. Thus, in one embodiment, fusion proteins comprise the cleavage domain (or cleavage half-domain) from at least one Type IIS restriction enzyme and one or more zinc finger binding domains, which may or may not be engineered.
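The 9 and 13 nucleotide offsets of the wild-type Fok I enzyme can be turned into cut coordinates. The Python sketch below is illustrative only and rests on assumptions not stated in the text: the recognition sequence is taken as 5'-GGATG-3', offsets are counted from the 3' end of that sequence, and only the forward orientation on the supplied strand is scanned. The function name fok1_cut_positions is hypothetical. It describes the native enzyme, not the zinc finger fusions, in which the Fok I recognition domain is replaced.

```python
RECOGNITION_SITE = "GGATG"  # assumption: wild-type Fok I recognition sequence
TOP_OFFSET = 9       # nucleotides past the site on the strand shown
BOTTOM_OFFSET = 13   # nucleotides past the site on the complementary strand


def fok1_cut_positions(top_strand):
    """Return (top_cut, bottom_cut) index pairs for each forward site.

    Each value is the 0-based index of the first base after the cut,
    expressed in coordinates of the supplied (top) strand. Sites too
    close to the end of the sequence to be cut are skipped.
    """
    seq = top_strand.upper()
    cuts = []
    start = seq.find(RECOGNITION_SITE)
    while start != -1:
        site_end = start + len(RECOGNITION_SITE)
        top_cut = site_end + TOP_OFFSET
        bottom_cut = site_end + BOTTOM_OFFSET
        if bottom_cut <= len(seq):
            cuts.append((top_cut, bottom_cut))
        start = seq.find(RECOGNITION_SITE, start + 1)
    return cuts


# Hypothetical 25-nt sequence with one site beginning at index 2.
example = "AA" + "GGATG" + "ACGTACGTACGTACGTAC"
print(fok1_cut_positions(example))
```

The four-nucleotide difference between the two offsets corresponds to the staggered ends left by the cut.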

An exemplary Type IIS restriction enzyme, whose cleavage domain is separable from the binding domain, is Fok I. This particular enzyme is active as a dimer. Bitinaite et al. (1998) Proc. Natl. Acad. Sci. USA 95: 10,570-10,575. Accordingly, for the purposes of the present disclosure, the portion of the Fok I enzyme used in the disclosed fusion proteins is considered a cleavage half-domain. Thus, for targeted double-stranded cleavage and/or targeted replacement of cellular sequences using zinc finger-Fok I fusions, two fusion proteins, each comprising a Fok I cleavage half-domain, can be used to reconstitute a catalytically active cleavage domain. Alternatively, a single polypeptide molecule containing a zinc finger binding domain and two Fok I cleavage half-domains can also be used. Parameters for targeted cleavage and targeted sequence alteration using zinc finger-Fok I fusions are provided elsewhere in this disclosure.

A cleavage domain or cleavage half-domain can be any portion of a protein that retains cleavage activity, or that retains the ability to multimerize (e.g., dimerize) to form a functional cleavage domain.

Exemplary Type IIS restriction enzymes are described in International Publication WO 07/014,275, incorporated herein in its entirety. Additional restriction enzymes also contain separable binding and cleavage domains, and these are contemplated by the present disclosure. See, for example, Roberts et al. (2003) Nucleic Acids Res. 31:418-420.

In certain embodiments, the cleavage domain comprises one or more engineered cleavage half-domains (also referred to as dimerization domain mutants) that minimize or prevent homodimerization, as described, for example, in U.S. Patent Publication Nos. 20050064474 and 20060188987 and in U.S. application Ser. No. 11/805,850 (filed May 23, 2007), the disclosures of all of which are incorporated by reference in their entireties herein. Amino acid residues at positions 446, 447, 479, 483, 484, 486, 487, 490, 491, 496, 498, 499, 500, 531, 534, 537, and 538 of Fok I are all targets for influencing dimerization of the Fok I cleavage half-domains.

Exemplary engineered cleavage half-domains of Fok I that form obligate heterodimers include a pair in which a first cleavage half-domain includes mutations at amino acid residues at positions 490 and 538 of Fok I and a second cleavage half-domain includes mutations at amino acid residues 486 and 499.

Thus, in one embodiment, a mutation at 490 replaces Glu (E) with Lys (K); the mutation at 538 replaces Ile (I) with Lys (K); the mutation at 486 replaces Gln (Q) with Glu (E); and the mutation at position 499 replaces Ile (I) with Leu (L). Specifically, the engineered cleavage half-domains described herein were prepared by mutating positions 490 (E→K) and 538 (I→K) in one cleavage half-domain to produce an engineered cleavage half-domain designated “E490K:I538K” and by mutating positions 486 (Q→E) and 499 (I→L) in another cleavage half-domain to produce an engineered cleavage half-domain designated “Q486E:I499L”. The engineered cleavage half-domains described herein are obligate heterodimer mutants in which aberrant cleavage is minimized or abolished. See, e.g., U.S. Patent Publication No. 2008/0131962, the disclosure of which is incorporated by reference in its entirety for all purposes.

The engineered cleavage half-domains described herein are obligate heterodimer mutants in which aberrant cleavage is minimized or abolished. See, e.g., Example 1 of WO 07/139,898. In certain embodiments, the engineered cleavage half-domain comprises mutations at positions 486, 499 and 496 (numbered relative to wild-type FokI), for instance mutations that replace the wild type Gln (Q) residue at position 486 with a Glu (E) residue, the wild type Ile (I) residue at position 499 with a Leu (L) residue and the wild-type Asn (N) residue at position 496 with an Asp (D) or Glu (E) residue (also referred to as “ELD” and “ELE” domains, respectively). In other embodiments, the engineered cleavage half-domain comprises mutations at positions 490, 538 and 537 (numbered relative to wild-type FokI), for instance mutations that replace the wild type Glu (E) residue at position 490 with a Lys (K) residue, the wild type Ile (I) residue at position 538 with a Lys (K) residue, and the wild-type His (H) residue at position 537 with a Lys (K) residue or an Arg (R) residue (also referred to as “KKK” and “KKR” domains, respectively). In other embodiments, the engineered cleavage half-domain comprises mutations at positions 490 and 537 (numbered relative to wild-type FokI), for instance mutations that replace the wild type Glu (E) residue at position 490 with a Lys (K) residue and the wild-type His (H) residue at position 537 with a Lys (K) residue or an Arg (R) residue (also referred to as “KIK” and “KIR” domains, respectively). (See U.S. provisional application 61/337,769 filed Feb. 8, 2010).
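The three-letter domain names above can be read as the residues found at three Fok I positions after mutation. The following Python sketch is illustrative only: it tabulates the wild-type residues and substitutions recited in this paragraph and derives each name and its substitution labels. The dictionary and function names are hypothetical, and the position order used for each name is taken from the order in which the positions are listed above.

```python
# Wild-type Fok I residues at the positions discussed (one-letter codes).
WILD_TYPE = {486: "Q", 490: "E", 496: "N", 499: "I", 537: "H", 538: "I"}

# Substitutions recited for each named engineered half-domain.
VARIANTS = {
    "ELD": {486: "E", 499: "L", 496: "D"},
    "ELE": {486: "E", 499: "L", 496: "E"},
    "KKK": {490: "K", 538: "K", 537: "K"},
    "KKR": {490: "K", 538: "K", 537: "R"},
    "KIK": {490: "K", 537: "K"},
    "KIR": {490: "K", 537: "R"},
}


def mutation_labels(name):
    """List substitutions such as 'E490K' for a named variant, by position."""
    changes = VARIANTS[name]
    return [f"{WILD_TYPE[pos]}{pos}{aa}" for pos, aa in sorted(changes.items())]


def residues(name, positions):
    """Residues of a variant at the given positions, wild type where unchanged."""
    merged = {**WILD_TYPE, **VARIANTS[name]}
    return "".join(merged[pos] for pos in positions)


print(mutation_labels("KKR"))
print(residues("ELD", (486, 499, 496)))
print(residues("KIK", (490, 538, 537)))
```

Read this way, the middle letter of “KIK” and “KIR” is the unmutated wild-type residue at position 538.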

Engineered cleavage half-domains described herein can be prepared using any suitable method, for example, by site-directed mutagenesis of wild-type cleavage half-domains (Fok I) as described in U.S. Patent Publication Nos. 20050064474 and 20080131962.

Alternatively, nucleases may be assembled in vivo at the nucleic acid target site using so-called “split-enzyme” technology (see e.g. U.S. Patent Publication No. 20090068164). Components of such split enzymes may be expressed either on separate expression constructs, or can be linked in one open reading frame where the individual components are separated, for example, by a self-cleaving 2A peptide or IRES sequence. Components may be individual zinc finger binding domains or domains of a meganuclease nucleic acid binding domain.

Nucleases (e.g., ZFNs) can be screened for activity prior to use, for example in a yeast-based chromosomal system as described in WO 2009/042163 and 20090068164.

Expression Vectors

A nucleic acid encoding one or more nucleases can be cloned into a vector for transformation into prokaryotic or eukaryotic cells. Vectors can be prokaryotic vectors, e.g., plasmids, or shuttle vectors, insect vectors, or eukaryotic vectors, including plant vectors described herein.

Nuclease expression constructs can be readily designed using methods known in the art. See, e.g., United States Patent Publications 20030232410; 20050208489; 20050026157; 20050064474; 20060188987; 20060063231; 20080182332; 2009011188 and International Publication WO 07/014,275. Expression of the nuclease may be under the control of a constitutive promoter or an inducible promoter, for example the galactokinase promoter which is activated (de-repressed) in the presence of raffinose and/or galactose and repressed in the presence of glucose. Non-limiting examples of plant promoters include promoter sequences derived from A. thaliana ubiquitin-3 (ubi-3) (Callis, et al., 1990, J. Biol. Chem. 265:12486-12493); A. tumefaciens mannopine synthase (Δmas) (Petolino et al., U.S. Pat. No. 6,730,824); and/or Cassava Vein Mosaic Virus (CsVMV) (Verdaguer et al., (1996) Plant Molecular Biology 31:1129-1139). Additional suitable bacterial and eukaryotic promoters are well known in the art and described, e.g., in Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd ed. 1989; 3^(rd) ed., 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., supra). Bacterial expression systems for expressing the ZFP are available in, e.g., E. coli, Bacillus sp., and Salmonella (Palva et al. (1983) Gene 22:229-235).

In addition to the promoter, the expression vector typically contains a transcription unit or expression cassette that contains all the additional elements required for the expression of the nucleic acid in host cells, either prokaryotic or eukaryotic. A typical expression cassette thus contains a promoter operably linked, e.g., to a nucleic acid sequence encoding the nuclease, and signals required, e.g., for efficient polyadenylation of the transcript, transcriptional termination, ribosome binding sites, or translation termination. Additional elements of the cassette may include, e.g., enhancers, heterologous splicing signals, and/or a nuclear localization signal (NLS).

Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, plant and insect cells are well known by those of skill in the art and are also commercially available.

Any of the well known procedures for introducing foreign nucleotide sequences into such host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, ultrasonic methods (e.g., sonoporation), liposomes, microinjection, naked DNA, plasmid vectors, viral vectors, both episomal and integrative, and any of the other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see, e.g., Sambrook et al., supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the protein of choice.

DNA constructs may be introduced into (e.g., into the genome of) a desired plant host by a variety of conventional techniques. For reviews of such techniques see, for example, Weissbach & Weissbach Methods for Plant Molecular Biology (1988, Academic Press, N.Y.) Section VIII, pp. 421-463; and Grierson & Corey, Plant Molecular Biology (1988, 2d Ed.), Blackie, London, Ch. 7-9.

For example, the DNA construct may be introduced directly into the genomic DNA of the plant cell using techniques such as electroporation and microinjection of plant cell protoplasts, or the DNA constructs can be introduced directly to plant tissue using biolistic methods, such as DNA particle bombardment (see, e.g., Klein et al (1987) Nature 327:70-73). Alternatively, the DNA constructs may be combined with suitable T-DNA flanking regions and introduced into a conventional Agrobacterium tumefaciens host vector. Agrobacterium tumefaciens-mediated transformation techniques, including disarming and use of binary vectors, are well described in the scientific literature. See, for example Horsch et al (1984) Science 233:496-498, and Fraley et al (1983) Proc. Nat'l. Acad. Sci. USA 80:4803.

In addition, gene transfer may be achieved using non-Agrobacterium bacteria or viruses such as Rhizobium sp. NGR234, Sinorhizobium meliloti, Mesorhizobium loti, potato virus X, cauliflower mosaic virus, cassava vein mosaic virus and/or tobacco mosaic virus. See, e.g., Chung et al. (2006) Trends Plant Sci. 11(1):1-4.

The virulence functions of the Agrobacterium tumefaciens host will direct the insertion of the construct and adjacent marker into the plant cell DNA when the cell is infected by the bacteria using binary T DNA vector (Bevan (1984) Nuc. Acid Res. 12:8711-8721) or the co-cultivation procedure (Horsch et al (1985) Science 227:1229-1231). Generally, the Agrobacterium transformation system is used to engineer dicotyledonous plants (Bevan et al (1982) Ann. Rev. Genet. 16:357-384; Rogers et al (1986) Methods Enzymol. 118:627-641). The Agrobacterium transformation system may also be used to transform, as well as transfer, DNA to monocotyledonous plants and plant cells. See U.S. Pat. No. 5,591,616; Hernalsteens et al (1984) EMBO J 3:3039-3041; Hooykaas-Van Slogteren et al (1984) Nature 311:763-764; Grimsley et al (1987) Nature 325:177-179; Boulton et al (1989) Plant Mol. Biol. 12:31-40; and Gould et al (1991) Plant Physiol. 95:426-434.

Alternative gene transfer and transformation methods include, but are not limited to, protoplast transformation through calcium-, polyethylene glycol (PEG)- or electroporation-mediated uptake of naked DNA (see Paszkowski et al. (1984) EMBO J 3:2717-2722, Potrykus et al. (1985) Molec. Gen. Genet. 199:169-177; Fromm et al. (1985) Proc. Nat. Acad. Sci. USA 82:5824-5828; and Shimamoto (1989) Nature 338:274-276) and electroporation of plant tissues (D'Halluin et al. (1992) Plant Cell 4:1495-1505). Additional methods for plant cell transformation include microinjection, silicon carbide mediated DNA uptake (Kaeppler et al. (1990) Plant Cell Reports 9:415-418), and microprojectile bombardment (see Klein et al. (1988) Proc. Nat. Acad. Sci. USA 85:4305-4309; and Gordon-Kamm et al. (1990) Plant Cell 2:603-618).

Administration of effective amounts is by any of the routes normally used for introducing nucleases into ultimate contact with the cell to be treated. The nucleases are administered in any suitable manner, preferably with pharmaceutically acceptable carriers. Suitable methods of administering such modulators are available and well known to those of skill in the art, and, although more than one route can be used to administer a particular composition, a particular route can often provide a more immediate and more effective reaction than another route.

Carriers may also be used and are determined in part by the particular composition being administered, as well as by the particular method used to administer the composition. Accordingly, there is a wide variety of suitable formulations of pharmaceutical compositions that are available (see, e.g., Remington's Pharmaceutical Sciences, 17th ed. 1985).

Organisms

The present invention is applicable to any organism in which it is desired to create a homozygously modified organism, including but not limited to eukaryotic organisms such as plants, animals (e.g., mammals such as mice, rats, primates, farm animals, rabbits, etc.), fish, and the like. Typically, the organisms are generated using isolated cells from the organism that can be genetically modified as described herein and can develop into reproductively mature organisms. Eukaryotic (e.g., yeast, plant, fungal, piscine and mammalian cells such as feline, canine, murine, bovine, and porcine) cells can be used. Cells from organisms containing one or more homozygous KO loci as described herein or other genetic modifications can also be used.

Exemplary mammalian cells include any cell or cell line of the organism of interest, for example oocytes, K562 cells, CHO (Chinese hamster ovary) cells, HEP-G2 cells, BaF-3 cells, Schneider cells, COS cells (monkey kidney cells expressing SV40 T-antigen), CV-1 cells, HuTu80 cells, NTERA2 cells, NB4 cells, HL-60 cells and HeLa cells, 293 cells (see, e.g., Graham et al. (1977) J. Gen. Virol. 36:59), and myeloma cells like SP2 or NS0 (see, e.g., Galfre and Milstein (1981) Meth. Enzymol. 73(B):3-46). Peripheral blood mononuclear cells (PBMCs) or T-cells can also be used, as can embryonic and adult stem cells. For example, stem cells that can be used include embryonic stem cells (ES), induced pluripotent stem cells (iPSC), mesenchymal stem cells, hematopoietic stem cells, muscle stem cells, skin stem cells and neuronal stem cells.

Exemplary target plants and plant cells include, but are not limited to, those monocotyledonous and dicotyledonous plants, such as crops including grain crops (e.g., wheat, maize, rice, millet, barley), fruit crops (e.g., tomato, apple, pear, strawberry, orange), forage crops (e.g., alfalfa), root vegetable crops (e.g., carrot, potato, sugar beets, yam), leafy vegetable crops (e.g., lettuce, spinach); flowering plants (e.g., petunia, rose, chrysanthemum), conifers and pine trees (e.g., pine, fir, spruce); plants used in phytoremediation (e.g., heavy metal accumulating plants); oil crops (e.g., sunflower, rape seed) and plants used for experimental purposes (e.g., Arabidopsis). Thus, the disclosed methods and compositions have use over a broad range of plants, including, but not limited to, species from the genera Asparagus, Avena, Brassica, Citrus, Citrullus, Capsicum, Cucurbita, Daucus, Erigeron, Glycine, Gossypium, Hordeum, Lactuca, Lolium, Lycopersicon, Malus, Manihot, Nicotiana, Orychophragmus, Oryza, Persea, Phaseolus, Pisum, Pyrus, Prunus, Raphanus, Secale, Solanum, Sorghum, Triticum, Vitis, Vigna, and Zea. The term plant cells includes isolated plant cells as well as whole plants or portions of whole plants such as seeds, callus, leaves, roots, etc. The present disclosure also encompasses seeds of the plants described above wherein the seed has the transgene or gene construct. The present disclosure further encompasses the progeny, clones, cell lines or cells of the transgenic plants described above wherein said progeny, clone, cell line or cell has the transgene or gene construct.

Targeted Integration

The first step in generating homozygously modified organisms as described herein involves nuclease-mediated targeted integration of a donor (exogenous) reporter sequence at the desired target locus. Specifically, the disclosed nucleases can be used to cleave DNA at a region of interest in cellular chromatin (e.g., at a desired or predetermined site in a genome). For such targeted DNA cleavage, the DNA binding domain of a nuclease (e.g., zinc finger binding domain) is engineered to bind a target site at or near the predetermined cleavage site, and a fusion protein comprising the DNA binding domain and a cleavage domain is expressed in a cell. Upon binding of the DNA-binding domain (e.g., zinc finger portion) of the fusion protein to the target site, the DNA is cleaved near the target site by the cleavage domain.

Alternatively, two fusion proteins, each comprising a zinc finger binding domain and a cleavage half-domain, are expressed in a cell, and bind to target sites which are juxtaposed in such a way that a functional cleavage domain is reconstituted and DNA is cleaved in the vicinity of the target sites. In one embodiment, cleavage occurs between the target sites of the two zinc finger binding domains. One or both of the zinc finger binding domains can be engineered.

Targeted cleavage by nucleases as described herein has been shown to result in targeted integration of a donor (exogenous) sequence (via homology-directed repair) at the site of cleavage. See, e.g., U.S. Patent Publication Nos. 2007/0134796, 2008/0299580, 2008/0182332, 2009/0117617, and 2009/0111188.

Thus, in addition to the nucleases described herein, targeted replacement (integration) of a selected genomic sequence also requires the introduction of the replacement (or donor) reporter sequence. The donor reporter sequence can be introduced into the cell prior to, concurrently with, or subsequent to, expression of the fusion protein(s). The donor reporter polynucleotide generally contains sufficient homology to a genomic sequence to support homologous recombination (or homology-directed repair) between it and the genomic sequence to which it bears homology. It will be readily apparent that the donor sequences are typically not identical to the genomic sequence that they replace. For example, the sequence of the donor polynucleotides can contain one or more single base changes, insertions, deletions, inversions or rearrangements with respect to the genomic sequence, so long as sufficient homology with chromosomal sequences is present.

In certain embodiments, introduction of a desired transgene may also be accomplished. The desired transgene donor sequences will also have sufficient homology to the genomic sequence to support homologous recombination or homology-directed repair between it and the genomic sequence to which it has homology. See, e.g., U.S. patent application Ser. No. 12/386,059. Donor transgenes of interest typically contain sequences encoding a sequence of interest. Non-limiting examples include gene regulator sequences (e.g., promoter sequences), sequences encoding a protein product (e.g., proteins involved in phenotypic modification of the organism or a therapeutic protein) or sequences encoding an RNA product such as a shRNA, RNAi etc.

The donor reporter sequence typically includes a sequence encoding a reporter gene for identification of cells in which targeted integration has occurred. Any reporter gene can be used. In certain embodiments, the reporter gene provides a directly detectable signal, for example, a signal from a fluorescent protein such as GFP (green fluorescent protein). Fluorescence is detected using a variety of commercially available fluorescent detection systems, including, e.g., a fluorescence-activated cell sorter (FACS) system.

Reporter genes may also be enzymes that catalyze the production of a detectable product (e.g., proteases, nucleases, lipases, phosphatases, sugar hydrolases and esterases). Non-limiting examples of suitable reporter genes that encode enzymes include, for example, MEL1, CAT (chloramphenicol acetyl transferase; Alton and Vapnek (1979) Nature 282:864-869), luciferase, β-galactosidase, β-glucuronidase, β-lactamase, horseradish peroxidase and alkaline phosphatase (e.g., Toh, et al. (1980) Eur. J. Biochem. 182:231-238; and Hall et al. (1983) J. Mol. Appl. Gen. 2:101).

Additional reporter genes include selectable markers (e.g., positive and/or negative selection markers), including but not limited to antibiotic resistance such as ampicillin resistance, neomycin resistance, G418 resistance, puromycin resistance as well as herbicide resistance such as the PAT gene.

The donor polynucleotides (reporter and/or transgene) can be DNA or RNA, single-stranded or double-stranded and can be introduced into a cell in linear or circular form. If introduced in linear form, the ends of the donor sequence can be protected (e.g., from exonucleolytic degradation) by methods known to those of skill in the art. For example, one or more dideoxynucleotide residues are added to the 3′ terminus of a linear molecule and/or self-complementary oligonucleotides are ligated to one or both ends. See, for example, Chang et al. (1987) Proc. Natl. Acad. Sci. USA 84:4959-4963; Nehls et al. (1996) Science 272:886-889. Additional methods for protecting exogenous polynucleotides from degradation include, but are not limited to, addition of terminal amino group(s) and the use of modified internucleotide linkages such as, for example, phosphorothioates, phosphoramidates, and O-methyl ribose or deoxyribose residues. A polynucleotide can be introduced into a cell as part of a vector molecule having additional sequences such as, for example, replication origins, promoters and genes encoding antibiotic or herbicide resistance. Moreover, donor polynucleotides can be introduced as naked nucleic acid, as nucleic acid complexed with an agent such as a liposome or poloxamer, or can be delivered by viruses (e.g., adenovirus, AAV, herpesvirus, retrovirus, lentivirus).

Cells can be assayed for targeted integration in any suitable way, including by examination (sequencing or PCR) of the selected locus or by selecting and/or screening the treated cells for traits encoded by the marker genes present on the donor DNA. For instance, selection may be performed by growing the engineered cells on media containing an inhibitory amount of the antibiotic or herbicide to which the transforming gene construct confers resistance. Further, transformed cells may also be identified by screening for the activities of any visible marker genes (e.g., fluorescent proteins, β-glucuronidase, luciferase, B or C1 genes) that may be present on the recombinant nucleic acid constructs. Such selection and screening methodologies are well known to those skilled in the art.

Physical and biochemical methods also may be used to identify cells containing the donor sequences inserted into the targeted locus. These methods include but are not limited to: 1) Southern analysis or PCR amplification for detecting and determining the structure of the recombinant DNA insert; 2) Northern blot, S1 RNase protection, primer-extension or reverse transcriptase-PCR amplification for detecting and examining RNA transcripts of the gene constructs; 3) enzymatic assays for detecting enzyme or ribozyme activity, where such gene products are encoded by the gene construct; 4) protein gel electrophoresis, Western blot techniques, immunoprecipitation, or enzyme-linked immunoassays, where the gene construct products are proteins. Additional techniques, such as in situ hybridization, enzyme staining, and immunostaining, also may be used to detect the presence or expression of the recombinant construct in specific plant organs and tissues. The methods for doing all these assays are well known to those skilled in the art.

Effects of gene manipulation using the methods disclosed herein can be observed by, for example, northern blots of the RNA (e.g., mRNA) isolated from the tissues of interest. Typically, if the amount of mRNA has increased, it can be assumed that the corresponding endogenous gene is being expressed at a greater rate than before. Other methods of measuring gene and/or CYP74B activity can be used. Different types of enzymatic assays can be used, depending on the substrate used and the method of detecting the increase or decrease of a reaction product or by-product. In addition, the levels of and/or CYP74B protein expressed can be measured immunochemically, i.e., ELISA, RIA, EIA and other antibody based assays well known to those of skill in the art, such as by electrophoretic detection assays (either with staining or western blotting).

Generating Homozygously Modified Organisms

Cells into which the reporter donor sequence has been inserted into the target locus are then assayed for the presence of modifications at the non-reporter-TI allele, for example NHEJ events or insertion of a transgene lacking sequences encoding a reporter (selectable marker). Such reporter-TI/modified cells can be identified using any suitable method known to the skilled artisan, including sequencing, PCR analysis and the like.
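The genotype bookkeeping described above can be illustrated with a minimal Python sketch. The allele labels ("WT", "REPORTER_TI", "NHEJ", "TRANSGENE_TI") and the function name are hypothetical conventions chosen for illustration only; in practice the allele calls would come from sequencing or PCR analysis of each clone.

```python
# Hypothetical per-allele calls, e.g. derived from sequencing or PCR genotyping.
ALLOWED_CALLS = {"WT", "REPORTER_TI", "NHEJ", "TRANSGENE_TI"}
# Calls that count as a "modification" of the non-reporter-TI allele.
MODIFIED_CALLS = {"NHEJ", "TRANSGENE_TI"}


def classify_clone(allele_a, allele_b):
    """Label a diploid clone from its two allele calls at the target locus."""
    for call in (allele_a, allele_b):
        if call not in ALLOWED_CALLS:
            raise ValueError(f"unknown allele call: {call!r}")

    n_reporter = [allele_a, allele_b].count("REPORTER_TI")
    if n_reporter == 2:
        return "reporter-TI/reporter-TI"
    if n_reporter == 0:
        return "no reporter TI"

    # Exactly one reporter-TI allele: inspect the other allele.
    other = allele_b if allele_a == "REPORTER_TI" else allele_a
    if other in MODIFIED_CALLS:
        return "reporter-TI/modified"
    return "reporter-TI/wild-type"


if __name__ == "__main__":
    clones = {
        "clone_1": ("REPORTER_TI", "WT"),
        "clone_2": ("NHEJ", "REPORTER_TI"),
        "clone_3": ("REPORTER_TI", "TRANSGENE_TI"),
    }
    for name, (a, b) in clones.items():
        print(name, classify_clone(a, b))
```

Only clones labeled reporter-TI/modified would be carried forward to generate whole organisms.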

Subsequently, the reporter-TI/modified mutants are cultured or otherwise treated such that they generate a whole organism with reporter-TI/modified genotype at the desired locus. For example, traditional methods of pro-nuclear injection or oocyte injection can be used to generate reporter-TI/modified animals. See, e.g., U.S. Patent Application No. 61/205,970 showing germline transmission of ZFN-modified rat oocytes.

Likewise, reporter-TI/modified plant cells can be cultured to regenerate a whole plant which possesses the transformed genotype and thus the desired phenotype. Such regeneration techniques rely on manipulation of certain phytohormones in a tissue culture growth medium, typically relying on a biocide and/or herbicide marker which has been introduced together with the desired nucleotide sequences. Plant regeneration from cultured protoplasts is described in Evans, et al., “Protoplasts Isolation and Culture” in Handbook of Plant Cell Culture, pp. 124-176, Macmillan Publishing Company, New York, 1983; and Binding, Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985. Regeneration can also be obtained from plant callus, explants, organs, pollens, embryos or parts thereof. Such regeneration techniques are described generally in Klee et al (1987) Ann. Rev. of Plant Phys. 38:467-486. One of skill in the art will recognize that after the expression cassette is stably incorporated in transgenic plants and confirmed to be operable, it can be introduced into other plants by sexual crossing. Any of a number of standard breeding techniques can be used, depending upon the species to be crossed. Further still, haploid organisms (e.g., gametophytes) may be created following meiosis of the transgenic organism. There are several organisms such as algae, fungi and some plants that are able to live at least part of their lifecycle in a haploid state.

Once the reporter TI/modified heterozygous organisms reach reproductive maturity, they can be crossed to each other, or in some instances, spores may be grown into haploids. Of the resulting progeny from crosses, approximately 25% will be homozygous modified/modified (NHEJ/NHEJ or non-reporter TI/non-reporter TI) at the target locus. Half of the haploid offspring will contain the modification of interest. The modified/modified organisms can be identified using any of the methods described above, including, but not limited to sequencing, PCR analysis and the like. These organisms will have the desired homozygous gene modification, but will not include any inserted exogenous reporter sequences (e.g., markers).
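The expected proportions stated above follow from ordinary single-locus segregation. The following Python sketch enumerates the gamete combinations for a cross of two reporter-TI/NHEJ heterozygotes; it assumes simple Mendelian inheritance with equal transmission and viability of both alleles, and the allele labels and function names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a, parent_b):
    """Genotype frequencies among diploid progeny of a single-locus cross.

    Each parent is a tuple of its two alleles; each allele is assumed to be
    transmitted with probability 1/2 (simple Mendelian segregation).
    """
    counts = Counter(
        tuple(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


def haploid_fraction(parent, allele):
    """Fraction of haploid offspring (e.g., spores) carrying a given allele."""
    return Fraction(parent.count(allele), len(parent))


if __name__ == "__main__":
    heterozygote = ("TI", "NHEJ")  # reporter-TI on one allele, NHEJ on the other
    progeny = cross(heterozygote, heterozygote)
    for genotype, freq in sorted(progeny.items()):
        print("/".join(genotype), freq)
    print("haploids carrying NHEJ:", haploid_fraction(heterozygote, "NHEJ"))
```

Under these assumptions, one quarter of the diploid progeny are NHEJ/NHEJ and therefore carry no reporter sequence at the locus, and one half of haploid offspring carry the modification.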

Kits

Also provided are kits for generating organisms as described herein. The kits typically contain polynucleotides encoding one or more nucleases and/or donor polynucleotides (e.g., with selectable markers) as described herein as well as instructions for analyzing selected TI clones for modifications at the non-reporter-TI allele and instructions for crossing the reporter-TI/modified organisms to each other to generate organisms that are homozygous at the disrupted locus without any inserted donor DNA into which the nucleases and/or donor polynucleotide are introduced. The kits can also contain cells, reagents, buffers for transformation of cells, culture media for cells, and/or buffers for performing assays. Typically, the kits also contain a label which includes any material such as instructions, packaging or advertising leaflet that is attached to or otherwise accompanies the other components of the kit.

Applications

The homozygously modified organisms described herein can be used for any application in which KO organisms with inserted exogenous sequences are currently used. Such organisms find use in biological and medical research, production of pharmaceutical drugs, experimental medicine, and agriculture.

For example, KO animals have proved very useful in analyzing the function of gene products and creating models for human diseases, thereby allowing drug discovery. Similarly, KO plants as described herein can be used to create crops with the desired genes disrupted but without the inserted sequences that potentially could damage native crops. Thus, KO plants as described herein can be non-transgenic GMOs in the sense that they do not include exogenous DNA. Alternatively, the KO organisms can lack inserted sequences at the disrupted locus (loci) but include transgenes at another locus or loci.

Creating plants or animals that are homozygous for a transgene but lack an exogenous reporter sequence is often desirable. Thus, methods and compositions described herein provide tools for generation of plants and animals in which a desired (non-reporter) gene sequence has been inserted into both alleles but the resultant plant progeny do not contain any reporter sequences. For example, regulator sequences may be inserted to control (repress or activate) a specific gene of interest. Similarly, a transgene may be inserted into all the alleles of a locus of interest. The transgene may be inserted to knock out a particular gene where expression of the target gene is not desired.

The following Examples relate to exemplary embodiments of the present disclosure in which the nuclease comprises a zinc finger nuclease (ZFN). It will be appreciated that this is for purposes of exemplification only and that other nucleases can be used, for instance homing endonucleases (meganucleases) with engineered DNA-binding domains and/or fusions of naturally occurring or engineered homing endonuclease (meganuclease) DNA-binding domains and heterologous cleavage domains or TALENs.

EXAMPLES

Example 1 Generation of Bi-Allelic Knockout Plants

ZFNs targeted to the IPK1 gene in Zea mays were used for targeted insertion (TI) of a herbicide resistance gene (PAT). The ZFNs used and TI techniques are described in Shukla et al. (2009) Nature 459:437-441 and U.S. Patent Publication Nos. 20080182332 and 20090111188. As described, ZFNs precisely modified the target locus by mono-allelic or bi-allelic targeted integration of the selectable marker.

Subsequently, the TI/- (mono-allelic) clones (events) were genotyped at the non-TI allele. As shown in FIG. 1, one of the events sequenced had an NHEJ-induced mutation (deletion) at the non-TI allele. FIG. 1 shows the wild type sequence and multiple sequence reads of the event. Such events are designated TI/NHEJ. TI/NHEJ events are then self pollinated by standard methods to obtain plants that are bi-allelic knockouts (−/− or NHEJ/NHEJ) at the targeted locus, but are devoid of the inserted reporter (selectable marker) sequence.

Example 2 Generation of Heterozygote Knockout Murine Stem Cells

ZFNs targeted to the murine histone H3.3B were used for targeted integration of a fluorescent marker enhanced yellow fluorescent protein (EYFP) at the start of the 3′ untranslated portion of this gene. The ZFNs were constructed essentially as described in U.S. Pat. No. 6,534,261. The recognition helices for the ZFN pair used as well as the target sequence are shown below in Tables 1 and 2.

TABLE 1
Murine H3.3B-targeted ZFNs

SBS #   Design F1                Design F2                Design F3                Design F4
7269    RSDHLSE (SEQ ID NO: 1)   RNDTRKT (SEQ ID NO: 2)   QSSNLAR (SEQ ID NO: 3)   RSDDRKT (SEQ ID NO: 4)
7270    DRSALSR (SEQ ID NO: 5)   TSANLSR (SEQ ID NO: 6)   RSDVLSE (SEQ ID NO: 7)   QRNHRTT (SEQ ID NO: 8)

TABLE 2
Target sites for H3.3B ZFNs

SBS #   Target Site
7269    cgCCGGATACGGGGag (SEQ ID NO: 9)
7270    gcCAACTGGATGTCtt (SEQ ID NO: 10)
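As a consistency check on Tables 1 and 2, the following Python sketch compares the number of recognition helices per ZFN with the length of each target site. It rests on two assumptions that are not stated in this section: that the uppercase letters in a target site mark the nucleotides contacted by the zinc fingers (lowercase being flanking sequence), and that each finger recognizes approximately three base pairs. The helix sequences are read from Table 1 as four seven-residue helices per ZFN.

```python
# Recognition helices (F1-F4) from Table 1 and target sites from Table 2.
ZFN_FINGERS = {
    "7269": ["RSDHLSE", "RNDTRKT", "QSSNLAR", "RSDDRKT"],
    "7270": ["DRSALSR", "TSANLSR", "RSDVLSE", "QRNHRTT"],
}
TARGET_SITES = {
    "7269": "cgCCGGATACGGGGag",
    "7270": "gcCAACTGGATGTCtt",
}

# Assumption: each zinc finger contacts roughly one 3-bp subsite.
BP_PER_FINGER = 3


def core_site(site):
    """Return the uppercase (assumed finger-contacted) part of a target site."""
    return "".join(base for base in site if base.isupper())


def subsites(site):
    """Split the core site into consecutive 3-bp subsites."""
    core = core_site(site)
    return [core[i:i + BP_PER_FINGER] for i in range(0, len(core), BP_PER_FINGER)]


def is_consistent(sbs):
    """True if the core site length matches 3 bp per listed finger."""
    return len(core_site(TARGET_SITES[sbs])) == BP_PER_FINGER * len(ZFN_FINGERS[sbs])


if __name__ == "__main__":
    for sbs in TARGET_SITES:
        print(sbs, subsites(TARGET_SITES[sbs]), is_consistent(sbs))
```

The sketch makes no claim about which finger binds which subsite; it only checks that the table entries are mutually consistent under the stated assumptions.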

A donor DNA was constructed containing the H3.3B gene operably linked to the EYFP sequence (see FIG. 2). Briefly, a PCR fragment of genomic DNA from mouse H3.3B was cloned out of a genomic bacterial artificial chromosome (BAC) from C57BL/6J mouse chromosome 11 using Phusion polymerase (NEB F-530L) and into a pCR2.1 vector (pCR2.1-H3.3B) using TA-TOPO cloning (Invitrogen K4500-02). To generate the H3.3B-EYFP donor construct (pCR2.1-H3.3B-EYFP), a 6 amino acid (SRPVAT) linker followed by the open-reading frame of EYFP (Clontech) was inserted in-frame into the last coding exon of H3.3B. The H3.3B-EYFP donor included no H3.3B promoter sequence, containing approximately 0.6 kb of 5′ homologous genomic sequence starting at the second H3.3B codon, including introns, until the last H3.3B coding amino acid, followed by the linker and EYFP, a stop codon, and approximately 1.3 kb homologous to the H3.3B 3′UTR. The donor and the expression vector containing the ZFN pair were then co-transfected into mouse embryonic stem cells. To deliver ZFNs and donor constructs, mouse ES cells were transfected by Amaxa nucleofection. In brief, immediately prior to transfection, ES cells were feeder depleted by harvesting the ES cells, plating on a feeder-free dish for 30 min, and then collecting the ES enriched non-adherent cells for transfection. 2-5×10^6 ES cells were resuspended in 90 μl solution, mixed with two non-linearized plasmids (1 μg of ZFN plasmid with both ZFNs separated by a 2A peptide sequence + 10 μg of donor plasmid) in 10 μl nucleofection solution, and transfected using program A-013 as described in the Amaxa manufacturer's protocol for mouse ES cells.

Following transfection, sterile plastic pipettes were used to transfer the cells to warm ES media in tissue culture dishes that were already prepared with feeders. After transfer, ES cells were cultured in standard conditions on treated feeders for 3-5 days prior to fluorescence-activated cell sorting (FACS) or fluorescent colony picking. Following colony picking, clonal isolation and expansion, genomic DNA was prepared using the Qiagen DNeasy Blood & Tissue Kit (Qiagen 69504). Individual clones were screened by PCR. PCR products from both wild-type and modified H3.3B alleles were sequenced using standard methods. For Southern blotting, genomic DNA from wild-type and targeted ES cells was digested with BsrBI, and a labeled 638 bp AvaII fragment of the H3.3B donor was used as probe to visualize wild-type H3.3B and integrated H3.3B donors.

FACS and Southern blot analysis confirmed that the EYFP had been integrated into the H3.3B locus (see FIG. 6). Approximately 20% of the clones that contained an EYFP integrated in one H3.3B locus were found to have had an NHEJ event at the other locus (see FIG. 7).
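The approximately 20% frequency reported above gives a rough sense of how many reporter-TI clones might need to be genotyped to recover a reporter-TI/NHEJ clone. The following Python sketch is a back-of-the-envelope calculation only: it assumes that clones are independent and that each has the same 20% chance of carrying an NHEJ event on the second allele, which may not hold for other loci, nucleases or cell types. The function names and the 95% confidence level are illustrative choices.

```python
import math


def prob_at_least_one(p, n):
    """Probability that at least one of n independent clones is positive."""
    return 1.0 - (1.0 - p) ** n


def clones_needed(p, confidence):
    """Smallest n such that prob_at_least_one(p, n) >= confidence."""
    if not (0.0 < p < 1.0) or not (0.0 < confidence < 1.0):
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    p_nhej = 0.20  # approximate frequency observed in Example 2
    n = clones_needed(p_nhej, 0.95)
    print("clones to screen for 95% confidence:", n)
    print("probability with that many clones:", round(prob_at_least_one(p_nhej, n), 3))
```

Under these assumptions, screening on the order of a dozen or so reporter-TI clones would be expected to yield at least one reporter-TI/NHEJ clone with high probability.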

Example 3 Generation of Homozygote Knockout Mice

The stem cells containing the heterozygous reporter TI/modified alleles at the locus of interest are used to generate homozygous modified/modified mice using standard protocols (for example see Manipulating the Mouse Embryo, A Laboratory Manual, 3rd Edition, Nagy et al, eds. Cold Spring Harbor Laboratory Press (2003)).

Example 4 Generation of Heterozygotic Mammalian Cells Containing a Transgene

Heterozygous cells are generated wherein one allele of the PPP1R12C gene (see U.S. Patent Publication No. 20080299580) contains the PGK-GFP-pA selectable marker, and the other allele contains a transgene carrying a novel RFLP in the PPP1R12C gene that creates a Hind III restriction site. Briefly, K562 cells are transfected with the ZFN expression plasmids as described above along with the two donor molecules. One donor comprises the reporter GFP driven by the PGK promoter, and the other donor comprises a PPP1R12C gene with the novel RFLP. GFP positive cells are isolated by limiting dilution and visual inspection. Clones are grown up and genomic DNA is isolated for genotyping by PCR and sequencing.

All patents, patent applications and publications mentioned herein are hereby incorporated by reference in their entirety.

Although disclosure has been provided in some detail by way of illustration and example for the purposes of clarity of understanding, it will be apparent to those skilled in the art that various changes and modifications can be practiced without departing from the spirit or scope of the disclosure. Accordingly, the foregoing descriptions and examples should not be construed as limiting.

1. An organism that is homozygous for a modification at a gene locus having at least first and second alleles, wherein (i) the first and second alleles of the gene locus comprise one or more deletions such that the gene locus is inactivated using a nuclease comprising a TAL-effector domain and further wherein the first and second alleles do not comprise exogenous sequences; or (ii) the first and second alleles of the gene locus comprise one or more exogenous sequences, wherein the one or more exogenous sequences do not encode a reporter or selectable marker.
2. The organism of claim 1, wherein the first and second alleles comprise one or more deletions such that the gene locus is inactivated.
3. The organism of claim 1, wherein the first and second alleles comprise one or more exogenous sequences not encoding a reporter or a selectable marker.
4. The organism of claim 1, wherein the organism is a plant or an animal.
5. The organism of claim 4, wherein the plant is selected from the group consisting of maize, rice, wheat, potato, soybean, tomato, tobacco, members of the Brassica family, and Arabidopsis.
6. A seed produced by the plant of claim 5.
7. The organism of claim 4, wherein the animal is a mammal.
8. A method of generating a homozygous organism according to claim 1, wherein the homozygous organism lacks exogenous sequences at a selected target locus, the method comprising, (a) introducing an exogenous sequence into a cell using a nuclease comprising a TAL-effector domain that mediates targeted integration of the exogenous sequence into a selected locus of the genome of the organism, the locus comprising at least first and second alleles; (b) identifying cells comprising (i) the exogenous sequence in the first allele of the target locus and (ii) a non-homologous end joining (NHEJ) modification in the second allele of the selected target locus; (c) allowing the cells identified in step (b) to develop into reproductively mature organisms; (d) crossing the reproductively mature organisms to each other; and (e) identifying progeny that exhibit NHEJ modifications at the first and second alleles of the selected target locus, thereby generating a homozygous organism lacking exogenous sequences at the selected target locus.
9. A method of generating an organism according to claim 1, wherein the organism is homozygous for one or more exogenous sequences at a selected target locus of the genome, wherein the exogenous sequences do not comprise a reporter or selectable marker at the selected target locus, the method comprising, (a) introducing a reporter or selectable marker sequence into a cell of the organism using a nuclease comprising a TAL-effector domain that mediates targeted integration of the reporter into the selected target locus of the genome, the selected target locus comprising at least first and second alleles; (b) introducing the one or more exogenous sequence(s) into the cell, wherein the nuclease mediates targeted integration of the exogenous sequence into the selected target locus of the genome, (c) identifying cells comprising (i) the reporter or selectable marker in the first allele of the selected target locus and (ii) the one or more exogenous sequences in the second allele; (d) allowing the cells identified in step (c) to develop to reproductively mature organisms; (e) crossing the reproductively mature organisms to each other; and (f) identifying progeny of the cross of step (e) that comprise the one or more exogenous sequences in the first and second alleles, thereby generating an organism that is homozygous for the exogenous sequences and lacking reporter or selectable marker sequences at the selected target locus.
10. The method of claim 9, wherein the reporter or selectable marker sequence and the one or more exogenous sequences are introduced concurrently with the one or more nucleases.
11. The method of claim 9, wherein the reporter or selectable marker sequence and the one or more exogenous sequences are introduced sequentially with the one or more nucleases.
12. The method of claim 8, wherein the reporter sequence comprises a selectable or screening marker.
13. The method of claim 8, wherein the one or more nucleases are introduced as a polynucleotide.
14. The method of claim 9, wherein the one or more nucleases are introduced as a polynucleotide.
15. A kit for generating an organism according to claim 1, the kit comprising: (a) one or more nucleases comprising that bind to a target site in the selected target locus; (b) one or more exogenous sequence for targeted integration into the selected target locus; and (c) instructions for: (i) introducing the nucleases and exogenous sequence into cells; (ii) identifying cells comprising one or more exogenous sequences are inserted into the first allele at the selected target locus; (iii) identifying cells of (ii) comprising a modification at the second allele of the selected target locus; (iv) growing the cells of (iii) into reproductively mature organisms; (v) crossing the organisms of (iv); and (vi) identifying progeny of the crosses of (v) that are homozygous for the targeted gene modification.
16. The kit of claim 15, wherein the nucleases are supplied as polynucleotides encoding the nucleases.
17. The kit of claim 15, further comprising an optional donor transgene comprising sequences homologous to the target site, wherein the donor transgene does not comprise a reporter gene.